--- title: "Homework 8 R Exercise" output: pdf_document: default html_document: default --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` We will study power and robustness with a couple simulation exercises. In the first exercise, we will compare the power of a two-sample pooled t test to the Wilcoxon rank sum test. To start, we will generate 1000 sets of two independent *normal* random samples with different means but equal variances (the t-test's home turf), and compare the proportion of times the two tests reject $H_0$. The Wilcoxon tests take a little time to carry out; do not be concerned if your results are not generated right away. Which test had the highest power (proportion of rejections) for you? ```{r eval=FALSE} #Generate 1000 samples of size 10 for two different #groups, with different means, but the same variance X1_values=matrix(rnorm(10000,mean=1,sd=1),ncol=10) X2_values=matrix(rnorm(10000,mean=0,sd=1),ncol=10) #Run pooled t-tests and rank sum tests on each of 1000 samples TTest=apply(X1_values,1,t.test,y=X2_values,var.equal=TRUE) WTest=apply(X1_values,1,wilcox.test,y=X2_values) #Store t-test p-values and Wilcoxon p-values T_pvalues=as.numeric(unlist(TTest)[names(unlist(TTest))=="p.value"]) W_pvalues=as.numeric(unlist(WTest)[names(unlist(WTest))=="p.value"]) #Proportion of cases out of 1000 for which we rejected H_0. sum(T_pvalues<0.05)/1000 sum(W_pvalues<0.05)/1000 ``` We can repeat the same exercise, but replace the normal random samples with Cauchy random samples. The Cauchy distribution is a heavy-tailed distribution for which the t-test will behave poorly. Note that the Cauchy distribution does not have a mean and standard deviation, but does have location and scale parameters, which we specify instead. Which test had the higher power here? ```{r eval=FALSE} #Generate 1000 samples of size 10 for two different #groups, with different means, but the same variance X1_values=matrix(rcauchy(10000,location=1,scale=1),ncol=10) X2_values=matrix(rcauchy(10000,location=0,scale=1),ncol=10) #Run pooled t-tests and rank sum tests on each of 1000 samples TTest=apply(X1_values,1,t.test,y=X2_values,var.equal=TRUE) WTest=apply(X1_values,1,wilcox.test,y=X2_values) #Store t-test p-values and Wilcoxon p-values T_pvalues=as.numeric(unlist(TTest)[names(unlist(TTest))=="p.value"]) W_pvalues=as.numeric(unlist(WTest)[names(unlist(WTest))=="p.value"]) #Count number of cases out of 1000 for which we rejected H_0. sum(T_pvalues<0.05)/1000 sum(W_pvalues<0.05)/1000 ```